Exploring big volume sensor data with Vroom

Comments:

ABSTRACT

State-of-the-art sensors within a single autonomous vehicle (AV) can produce video and LIDAR data at rates greater than 30 GB/hour. Unsurprisingly, even small AV research teams can accumulate tens of terabytes of sensor data from multiple trips and multiple vehicles. AV practitioners would like to extract information about specific locations or specific situations for further study, but are often unable to. Queries over AV sensor data are different from generic analytics or spatial queries because they demand reasoning about fields of view as well as heavy computation to extract features from scenes. In this article and demo we present Vroom, a system for ad-hoc queries over AV sensor databases. Vroom combines domain-specific properties of AV datasets with selective indexing and multi-query optimization to address challenges posed by AV sensor data.

##Introduction

AVs generate data from high-resolution cameras, lidar, and GPS at about 10 MBps.

Queries:
Q1 Compute basic statistics on recent trips, such as data rates by sensor and location coverage.
Q2 [building 3D maps] Retrieve all forward-facing video frames of the corner of Vassar and Main St. in Cambridge, MA, ordered clockwise.
Q3 [preparing labeled training data] Retrieve lidar and video readings for all cameras in the vehicle, for intervals when any vehicle camera frame shows a bicycle. Group the data by trip, and order it by timestamp within each trip.
Q4 [preparing labeled training data] Retrieve all sensor readings in the minute leading up to an interesting event, such as a possible near miss (e.g., where a vehicle's CAN bus records a sudden brake or sharp steer). Group the readings by trip and order them by timestamp within each trip.
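
To make the shape of Q4 concrete, here is a minimal in-memory sketch of "readings in the minute before an event, grouped by trip and ordered by timestamp". It is not Vroom's query interface, which these notes do not describe. The `Reading` record, the `readings_before_events` function, and the `(trip_id, timestamp)` event format are all hypothetical names chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """Hypothetical sensor reading; field names are assumptions."""
    trip_id: str
    timestamp: float  # seconds
    sensor: str
    payload: str


def readings_before_events(readings, events, window_s=60.0):
    """Return {trip_id: [Reading, ...]} for readings that fall within
    window_s seconds leading up to any event of the same trip.

    events is an iterable of (trip_id, event_timestamp) pairs, e.g.
    sudden-brake records taken from the CAN bus.
    """
    events_by_trip = defaultdict(list)
    for trip_id, ts in events:
        events_by_trip[trip_id].append(ts)

    grouped = defaultdict(list)
    for r in readings:
        trip_events = events_by_trip.get(r.trip_id, ())
        if any(ts - window_s <= r.timestamp <= ts for ts in trip_events):
            grouped[r.trip_id].append(r)

    # Group by trip, then order by timestamp within each trip.
    return {
        trip: sorted(rs, key=lambda r: r.timestamp)
        for trip, rs in sorted(grouped.items())
    }
```

A linear scan like this touches every reading, which is exactly what becomes impractical at tens of terabytes and motivates the indexing and clustering ideas listed under Architecture.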

Challenges

Computational intensity of UDFs, such as deep-learning-based classification
Big volumes: ad-hoc queries over large historical data
Many features of interest:
Interface and storage issues:

Architecture

Sophisticated feature precomputation and indexing:
Synthesizing cheap predicates:
Memoizing:
Storage clustering, based on the workload
Multi-query optimization:
[to read] polystore data model
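
As a rough illustration of how two of the ideas above could fit together (a cheap predicate that prunes inputs before an expensive UDF, and memoizing the UDF's results), here is a small sketch. It is an assumption-laden toy, not Vroom's implementation: `contains_bicycle` is a placeholder standing in for a classifier, and `in_bounding_box`, `bicycle_frames`, and the `(frame_id, lat, lon)` frame format are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def contains_bicycle(frame_id: str) -> bool:
    """Stand-in for an expensive UDF such as a deep-learning classifier.

    Placeholder logic only: a real UDF would load and classify the frame.
    lru_cache memoizes the result per frame_id, so repeated queries do
    not recompute it.
    """
    return frame_id.endswith("bike")


def in_bounding_box(lat, lon, box):
    """Cheap predicate: a metadata-only check that needs no pixel data."""
    min_lat, min_lon, max_lat, max_lon = box
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon


def bicycle_frames(frames, box):
    """frames is an iterable of (frame_id, lat, lon) tuples.

    The cheap predicate runs first; the expensive UDF is only evaluated
    for frames that survive it (short-circuit `and`).
    """
    return [
        frame_id
        for frame_id, lat, lon in frames
        if in_bounding_box(lat, lon, box) and contains_bicycle(frame_id)
    ]
```

Calling `bicycle_frames` twice on the same frames evaluates the classifier stub only on the first pass; `contains_bicycle.cache_info()` reports the hits and misses.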
